 convex formulation



Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

We address your concerns as follows. This clearly shows the advantage of our method. We answer your main questions below.

Q1: "Do we need to commit ourselves to the OVR loss? ... considering a loss function such as softmax cross entropy" You are absolutely correct! If convexity is not required (e.g., in an NN implementation), we can use more flexible multiclass and binary losses. We will make this clear in the revision.

Q2: "How to use the non-negative risk estimator in this problem?" We will add more elaboration on the formulation in the revision.

Q3: "My question is: have you tried different loss functions?" However, it does not converge in experiments, so we instead use the sigmoid loss, following Kiryo et al. [24]. Theorem 1 serves as a guide for choosing a binary loss under the OVR scheme; thus, a consistency guarantee (Theorem 1) is necessary.

Thanks for the detailed review and helpful comments. We address your main concerns as follows. We will discuss the other minor issues in the paper and revise it according to your suggestions. We would like to revise the terminology in the revision if that is allowed.

Q2: "Some of the claims made about prior work are not accurate."
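Since several answers above turn on the OVR (one-versus-rest) scheme and the choice of binary loss, a generic sketch of the OVR risk, written in our own notation as an illustration rather than the paper's exact formulation, is

$$ R_{\mathrm{OVR}}(f) \;=\; \mathbb{E}_{(x,y)}\Big[\, \ell\big(f_y(x)\big) \;+\; \sum_{k \neq y} \ell\big(-f_k(x)\big) \Big], $$

where $f_k$ scores class $k$ and $\ell$ is a binary loss. The sigmoid loss $\ell(z) = 1/(1 + e^{z})$ adopted following Kiryo et al. [24] is one bounded choice, and a consistency guarantee such as Theorem 1 characterizes which choices of $\ell$ allow minimizing the OVR risk to recover the optimal classifier.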


Beyond the Birkhoff Polytope: Convex Relaxations for Vector Permutation Problems

Cong Han Lim, Stephen Wright

Neural Information Processing Systems

We modify the recent convex formulation of the 2-SUM problem introduced by Fogel et al. [2] to use this polytope, and demonstrate how we can attain results of similar quality in significantly less computational time for large n. To our knowledge, this is the first usage of Goemans' compact formulation of the permutahedron in a convex optimization problem. We also introduce a simpler regularization scheme for this convex formulation of the 2-SUM problem that yields good empirical results.


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. The submission describes a convex deep learning formulation that leverages a number of key ideas. First, a training objective is proposed that explicitly includes the outputs of hidden layers as variables to be inferred via optimization. These are linked to linear responses via a loss function, and the net objective is the sum of these loss functions across the layers, plus some regularization terms. Next, a number of changes of variables are performed in order to reparameterize the objective into a convex form, heavily leveraging the representer theorem and the idea of value regularization. We are left with a convex objective in terms of three different matrices (per layer) to optimize. In particular, one of these matrices is a nonparametric 'normalized output kernel' matrix, which takes the place of optimizing over the hidden layer outputs directly; however, this leads to a transductive method where we must simultaneously solve the optimization for training and test inputs.
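To make the structure described above concrete, a hedged two-layer sketch in our own notation (not the submission's) is

$$ \min_{W_1, W_2, H} \; L_1(W_1 X,\, H) \;+\; L_2(W_2 H,\, Y) \;+\; \Omega_1(W_1) + \Omega_2(W_2), $$

where $X$ are the inputs, $Y$ the targets, and the hidden-layer outputs $H$ are explicit optimization variables linked to the linear responses $W_1 X$ and $W_2 H$ through the losses. The reparameterization then replaces $H$ with a normalized output kernel $N \propto H^\top H$, so the problem becomes convex in kernel-matrix variables but transductive: $N$ carries a row and column for every training and test input, which is why the optimization over training and test inputs must be solved jointly.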


Beyond the Birkhoff Polytope: Convex Relaxations for Vector Permutation Problems

Neural Information Processing Systems

The Birkhoff polytope (the convex hull of the set of permutation matrices), which is represented using $\Theta(n^2)$ variables and constraints, is frequently invoked in formulating relaxations of optimization problems over permutations. Using a recent construction of Goemans (2010), we show that when optimizing over the convex hull of the permutation vectors (the permutahedron), we can reduce the number of variables and constraints to $\Theta(n \log n)$ in theory and $\Theta(n \log^2 n)$ in practice. We modify the recent convex formulation of the 2-SUM problem introduced by Fogel et al. (2013) to use this polytope, and demonstrate how we can attain results of similar quality in significantly less computational time for large $n$. To our knowledge, this is the first usage of Goemans' compact formulation of the permutahedron in a convex optimization problem. We also introduce a simpler regularization scheme for this convex formulation of the 2-SUM problem that yields good empirical results.
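As a concrete reference for what is being relaxed, the 2-SUM problem can be stated (in our notation, as a sketch) as

$$ \min_{\pi \in \Pi_n} \; \sum_{i,j} A_{ij}\,\big(\pi(i) - \pi(j)\big)^2, $$

where $A$ is a nonnegative similarity matrix and $\Pi_n$ the set of permutations of $\{1,\dots,n\}$. Relaxing the vector $(\pi(1),\dots,\pi(n))$ over the permutahedron $PH_n = \mathrm{conv}\{\text{permutations of } (1,2,\dots,n)\}$ yields a convex problem; Goemans' construction encodes $PH_n$ through sorting networks using $\Theta(n \log n)$ variables and constraints, versus the $\Theta(n^2)$ required to describe the Birkhoff polytope.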


Tight convex relaxations for sparse matrix factorization

Neural Information Processing Systems

Based on a new atomic norm, we propose a new convex formulation for sparse matrix factorization problems in which the number of nonzero elements of the factors is assumed fixed and known. The formulation counts sparse PCA with multiple factors, subspace clustering and low-rank sparse bilinear regression as potential applications. We compute slow rates and an upper bound on the statistical dimension of the suggested norm for rank-1 matrices, showing that its statistical dimension is an order of magnitude smaller than those of the usual $\ell_1$-norm, trace norm and their combinations. Even though our convex formulation is in theory hard and does not lead to provably polynomial-time algorithmic schemes, we propose an active set algorithm leveraging the structure of the convex problem to solve it and show promising numerical results.
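For context on the atomic-norm machinery the abstract invokes, the general construction (in the sense of Chandrasekaran et al., 2012) over an atom set $\mathcal{A}$ is

$$ \|X\|_{\mathcal{A}} \;=\; \inf\Big\{ \textstyle\sum_{a} c_a \;:\; X = \sum_{a} c_a\, a,\;\; c_a \ge 0,\; a \in \mathcal{A} \Big\}, $$

and a natural atom set for this setting, offered here as our hedged reading rather than the paper's verbatim definition, is the set of unit-norm rank-1 matrices $uv^\top$ in which $u$ has at most $k$ and $v$ at most $q$ nonzero entries, matching the assumption that the factors' sparsity levels are fixed and known.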


A Convex Formulation of Compliant Contact between Filaments and Rigid Bodies

Wei-Chen Li, Glen Chou

arXiv.org Artificial Intelligence

We present a computational framework for simulating filaments interacting with rigid bodies through contact. Filaments are challenging to simulate due to their codimensionality, i.e., they are one-dimensional structures embedded in three-dimensional space. Existing methods often assume that filaments remain permanently attached to rigid bodies. Our framework unifies discrete elastic rod (DER) modeling, a pressure field patch contact model, and a convex contact formulation to accurately simulate frictional interactions between slender filaments and rigid bodies - capabilities not previously achievable. Owing to the convex formulation of contact, each time step can be solved to global optimality, guaranteeing complementarity between contact velocity and impulse. Finally, we demonstrate its applicability in both soft robotics, such as a stochastic filament-based gripper, and deformable object manipulation, such as shoelace tying, providing a versatile simulator for systems involving complex filament-filament and filament-rigid body interactions.
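The complementarity guarantee mentioned above can be written generically (our notation, a sketch rather than the paper's exact conditions): at each contact, the normal impulse $\gamma$ and the post-step separation velocity $v_n$ satisfy

$$ 0 \le \gamma \;\perp\; v_n \ge 0, $$

that is, impulses push but never pull, and a nonzero impulse may arise only while the contact is not separating. Because such conditions emerge as optimality conditions of a single convex program per time step, the solver reaches them at the global optimum rather than at a local stationary point.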


A Convex Formulation of Material Points and Rigid Bodies with GPU-Accelerated Async-Coupling for Interactive Simulation

Chang Yu, Wenxin Du, Zeshun Zong, Alejandro Castro, Chenfanfu Jiang, Xuchen Han

arXiv.org Artificial Intelligence

We present a novel convex formulation that weakly couples the Material Point Method (MPM) with rigid body dynamics through frictional contact, optimized for efficient GPU parallelization. Our approach features an asynchronous time-splitting scheme to integrate MPM and rigid body dynamics under different time step sizes. We develop a globally convergent quasi-Newton solver tailored for massive parallelization, achieving up to 500x speedup over previous convex formulations without sacrificing stability. Our method enables interactive-rate simulations of robotic manipulation tasks with diverse deformable objects including granular materials and cloth, with strong convergence guarantees. We detail key implementation strategies to maximize performance and validate our approach through rigorous experiments, demonstrating superior speed, accuracy, and stability compared to state-of-the-art MPM simulators for robotics. We make our method available in the open-source robotics toolkit, Drake.
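To illustrate the asynchronous time-splitting idea in the abstract, here is a minimal, hypothetical Python sketch in our own notation: a toy 1-D rigid body advances with a large step dt_rigid while a toy MPM stage substeps with dt_mpm, exchanging a coupling impulse once per rigid step. All function names and the linear toy dynamics are assumptions for illustration; this is not Drake's API or the paper's solver.

def rigid_step(v_rigid, impulse, dt, mass=1.0, gravity=-9.81):
    """Advance the rigid body's velocity under gravity plus the coupling impulse."""
    return v_rigid + dt * gravity + impulse / mass

def mpm_substep(v_mpm, v_rigid, dt, stiffness=50.0):
    """Advance the MPM velocity; a penalty-style pull toward the rigid body's
    velocity stands in for the frictional contact coupling (hypothetical)."""
    return v_mpm + dt * stiffness * (v_rigid - v_mpm)

def coupling_impulse(v_rigid, v_mpm, kappa=0.1):
    """Impulse the MPM stage applies back on the rigid body (hypothetical)."""
    return kappa * (v_mpm - v_rigid)

def simulate(t_end=1.0, dt_rigid=1e-2, substeps=10):
    dt_mpm = dt_rigid / substeps  # MPM runs at a smaller step than the rigid stage
    v_rigid = v_mpm = 0.0
    impulse = 0.0
    t = 0.0
    while t < t_end:
        # Rigid stage: one large step using the impulse from the previous MPM stage.
        v_rigid = rigid_step(v_rigid, impulse, dt_rigid)
        # MPM stage: several small substeps against the frozen rigid velocity.
        for _ in range(substeps):
            v_mpm = mpm_substep(v_mpm, v_rigid, dt_mpm)
        # Weak (asynchronous) coupling: impulses are exchanged once per rigid step.
        impulse = coupling_impulse(v_rigid, v_mpm)
        t += dt_rigid
    return v_rigid, v_mpm

if __name__ == "__main__":
    print(simulate())

The point of the split is that the expensive MPM stage can run many cheap substeps in parallel on the GPU, while the rigid stage and the coupling exchange happen at a coarser cadence.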


Convex Deep Learning via Normalized Kernels

Özlem Aslan, Xinhua Zhang, Dale Schuurmans

Neural Information Processing Systems

Deep learning has been a long-standing pursuit in machine learning, one that until recently was hampered by unreliable training methods before the discovery of improved heuristics for embedded layer training. A complementary research strategy is to develop alternative modeling architectures that admit efficient training methods while expanding the range of representable structures toward deep models. In this paper, we develop a new architecture for nested nonlinearities that allows arbitrarily deep compositions to be trained to global optimality. The approach admits both parametric and nonparametric forms through the use of normalized kernels to represent each latent layer. The outcome is a fully convex formulation that is able to capture compositions of trainable nonlinear layers to arbitrary depth.
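A hedged sketch of the central substitution, in our own notation: rather than optimizing a latent layer's outputs $H$ directly, one optimizes its normalized output kernel

$$ N \;=\; H^\top H, \qquad N \succeq 0, $$

over a convex set of kernel matrices (a normalization constraint such as fixing the diagonal of $N$ is one plausible choice; the paper's exact constraint set may differ). Any loss or regularizer that touches $H$ only through inner products then becomes a convex function of $N$, which is the mechanism behind the fully convex formulation referred to above.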

